Results 1 - 20 of 28
1.
Engineering Letters ; 31(2):813-819, 2023.
Article in English | Scopus | ID: covidwho-20245156

ABSTRACT

The COVID-19 pandemic has hit the Indonesian economy hard. Many businesses had to close because they could not cover operational costs, and many workers were laid off, creating an unemployment crisis. Unemployment causes people's productivity and income to decrease, leading to poverty and other social problems, making it a crucial problem and a great concern for the nation. Economic conditions during this pandemic have also produced unusual patterns in economic data, in which outliers may occur, leading to biased parameter estimates. For that reason, it is necessary to deal with outliers in research data appropriately. This study aims to find within-group estimators for an unbalanced panel data regression model of the Open Unemployment Rate (OUR) in East Kalimantan Province and the factors that influence it. The method used is the within transformation with mean centering and median centering. The results of this study may provide advice on factors that can increase and decrease the OUR of East Kalimantan Province. The results show that the best model for estimating OUR data in East Kalimantan Province is the within-transformation estimation method using median centering. According to the best model, the Human Development Index (HDI) and Gross Regional Domestic Product (GRDP) are two factors that influence the OUR of East Kalimantan Province. © 2023, International Association of Engineers. All rights reserved.
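
As an illustration of the within transformation named in this abstract, the following is a minimal sketch and not the authors' implementation. It assumes the panel is a mapping from a unit (for example a district) to that unit's observations, which may differ in length across units. The function name `within_transform` and the sample values are hypothetical.

```python
from statistics import mean, median


def within_transform(panel, center=mean):
    """Subtract each unit's own center (mean or median) from its values.

    `panel` maps a unit id to a list of observations. Lists may have
    different lengths, as in an unbalanced panel.
    """
    transformed = {}
    for unit, values in panel.items():
        c = center(values)  # unit-specific center
        transformed[unit] = [v - c for v in values]
    return transformed


# Hypothetical data: unit "A" contains one extreme observation.
panel = {"A": [5.0, 6.0, 7.0, 30.0], "B": [4.0, 5.0]}
mean_centered = within_transform(panel, center=mean)
median_centered = within_transform(panel, center=median)
```

With median centering, the extreme value in unit "A" shifts the other centered observations far less than it does under mean centering, which is the usual motivation for the median variant when outliers are present.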

2.
International Journal of Business Intelligence and Data Mining ; 22(3):287-309, 2023.
Article in English | Scopus | ID: covidwho-2314087

ABSTRACT

An outlier is a value that lies outside most of the other values in a dataset. Outlier exploration is of great importance in almost all industry applications, such as medical diagnosis, credit card fraud and intrusion detection systems. Similarly, in the economic domain, it can be applied to analyse many unexpected events to harvest new knowledge, such as a sudden stock market crash, a mismatch between a country's per capita income and its overall development, an abrupt change in the unemployment rate and a steep fall in bank interest rates. These situations can arise for several reasons, of which the present COVID-19 pandemic is a leading one. This motivates the present researchers to identify a few such vulnerable areas in the economic sphere and ferret out the most affected countries for each of them. Two well-known machine-learning techniques, DBSCAN and Z-score, are utilised to get these insights, which can serve as a guideline towards improving the overall scenario subsequently. Copyright © 2023 Inderscience Enterprises Ltd.
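
The Z-score technique mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the threshold of 3 standard deviations is a common convention, not a value taken from the paper, and the function name `zscore_outliers` and the sample series are hypothetical. DBSCAN is not shown because it would need a third-party library.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=3.0):
    """Return the indices whose absolute Z-score exceeds `threshold`."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, nothing can be flagged
    return [i for i, v in enumerate(values)
            if abs(v - mu) / sigma > threshold]


# Hypothetical indicator: 19 ordinary readings and one extreme reading.
series = [5.0] * 19 + [50.0]
flagged = zscore_outliers(series)
```

Note that with few observations a single extreme value inflates the standard deviation, so its Z-score is bounded and a threshold of 3 may never be reached in very small samples.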

3.
ACM Journal of Data and Information Quality ; 15(1), 2023.
Article in English | Web of Science | ID: covidwho-2310881

ABSTRACT

Much of today's data are represented as graphs, ranging from social networks to bibliographic citations. Nodes in such graphs correspond to records that generally represent entities, while edges represent relationships between these entities. Both nodes and edges in a graph can have attributes that characterize the entities and their relationships. Relationships are either explicitly known (like friends in a social network), or they are inferred using link prediction (such as two babies are siblings because they have the same mother). Any graph representing real-world data likely contains nodes and edges that are abnormal, and identifying these can be important for outlier detection in applications ranging from crime and fraud detection to viral marketing. We propose a novel approach to the unsupervised detection of abnormal nodes and edges in graphs. We first characterize nodes and edges using a set of features, and then employ a one-class classifier to identify abnormal nodes and edges. We extract patterns of features from these abnormal nodes and edges, and apply clustering to identify groups of patterns with similar characteristics. We finally visualize these abnormal patterns to show co-occurrences of features and relationships between those features that mostly influence the abnormality of nodes and edges. We evaluate our approach on datasets from diverse domains, including historical birth certificates, COVID patient records, e-mails, books, and movies. This evaluation demonstrates that our approach is well suited to identify both abnormal nodes and edges in graphs in an unsupervised way, and it can outperform several baseline anomaly detection techniques.

4.
Lecture Notes on Data Engineering and Communications Technologies ; 165:465-479, 2023.
Article in English | Scopus | ID: covidwho-2296443

ABSTRACT

Classical statistics are usually based on parametric models, where the performance depends heavily on assumptions and is not robust in the presence of outliers in the data. Due to the COVID-19 pandemic, our daily lives have changed significantly, including slowing economic growth. These extreme changes can manifest as outliers in time series studies and adversely affect the results of data analysis. Many classical methods of official statistics are sensitive to outliers. In this work, we evaluate two machine learning methods, Support Vector Regression (SVR) and Random Forest (RF), and compare them with ARIMA to determine their robustness through simulation studies. Robustness is measured by the sensitivity of the SVR and Random Forest hyperparameters and the model's error in the presence of outliers. Simulations show that more outliers lead to higher RMSE values, and conversely, more samples lead to lower RMSE values. The type of outlier significantly impacts the RMSE value of the ARIMA model, where additive outliers (AO) have a worse impact than temporary change (TC). Consecutive outliers produce a smaller RMSE mean than non-consecutive outliers. Based on the sensitivity of hyperparameters, SVR and Random Forest models are relatively robust to the presence of outliers in the data. Based on the simulation results of 100 iterations, we find that SVR is more robust than ARIMA and Random Forest in modeling time series data with outliers. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

5.
Journal of Data and Information Quality ; 15(1), 2022.
Article in English | Scopus | ID: covidwho-2280499

ABSTRACT

Much of today's data are represented as graphs, ranging from social networks to bibliographic citations. Nodes in such graphs correspond to records that generally represent entities, while edges represent relationships between these entities. Both nodes and edges in a graph can have attributes that characterize the entities and their relationships. Relationships are either explicitly known (like friends in a social network), or they are inferred using link prediction (such as two babies are siblings because they have the same mother). Any graph representing real-world data likely contains nodes and edges that are abnormal, and identifying these can be important for outlier detection in applications ranging from crime and fraud detection to viral marketing. We propose a novel approach to the unsupervised detection of abnormal nodes and edges in graphs. We first characterize nodes and edges using a set of features, and then employ a one-class classifier to identify abnormal nodes and edges. We extract patterns of features from these abnormal nodes and edges, and apply clustering to identify groups of patterns with similar characteristics. We finally visualize these abnormal patterns to show co-occurrences of features and relationships between those features that mostly influence the abnormality of nodes and edges. We evaluate our approach on datasets from diverse domains, including historical birth certificates, COVID patient records, e-mails, books, and movies. This evaluation demonstrates that our approach is well suited to identify both abnormal nodes and edges in graphs in an unsupervised way, and it can outperform several baseline anomaly detection techniques. © 2022 Copyright held by the owner/author(s).

6.
Int J Environ Res Public Health ; 20(5), 2023 Feb 28.
Article in English | MEDLINE | ID: covidwho-2254578

ABSTRACT

In the last few years, many studies have been conducted on the most harmful pandemic, COVID-19. Machine learning approaches have been applied to investigate chest X-rays of COVID-19 patients in many respects. This study focuses on the deep learning algorithm from the standpoint of feature space and similarity analysis. Firstly, we utilized Local Interpretable Model-agnostic Explanations (LIME) to justify the necessity of the region of interest (ROI) process and further prepared ROI via U-Net segmentation that masked out non-lung areas of images to prevent the classifier from being distracted by irrelevant features. The experimental results were promising, with detection performance reaching an overall accuracy of 95.5%, a sensitivity of 98.4%, a precision of 94.7%, and an F1 score of 96.5% on the COVID-19 category. Secondly, we applied similarity analysis to identify outliers and further provided an objective confidence reference specific to the similarity distance to centers or boundaries of clusters while inferring. Finally, the experimental results suggested putting more effort into enhancing the low-accuracy subspace locally, which is identified by the similarity distance to the centers. Based on those perspectives, our approach could be more flexible, deploying dedicated classifiers specific to different subspaces instead of one rigid end-to-end black-box model for the entire feature space.


Subject(s)
COVID-19 , Datasets as Topic , Deep Learning , X-Rays , Humans , Algorithms , Mass Chest X-Ray

7.
BMC Public Health ; 23(1): 148, 2023 Jan 21.
Article in English | MEDLINE | ID: covidwho-2232617

ABSTRACT

BACKGROUND: One of the seminal events since 2019 has been the outbreak of the SARS-CoV-2 pandemic. Countries have adopted various policies to deal with it, but they also differ in their socio-geographical characteristics and public health care facilities. Our study aimed to investigate differences between epidemiological parameters across countries. METHOD: The analysed data come from the SARS-CoV-2 repository provided by the Johns Hopkins University. Separately for each country, we estimated recovery and mortality rates using the SIRD model applied to the first 30, 60, 150, and 300 days of the pandemic. Moreover, a mixture of normal distributions was fitted to the number of confirmed cases and deaths during the first 300 days. The estimates of peaks' means and variances were used to identify countries with outlying parameters. RESULTS: For 300 days, Belgium, Cyprus, France, the Netherlands, Serbia, and the UK were classified as outliers by all three outlier detection methods. Yemen was classified as an outlier for each of the four considered timeframes, due to high mortality rates. During the first 300 days of the pandemic, the majority of countries underwent three peaks in the number of confirmed cases, except Australia and Kazakhstan with two peaks. CONCLUSIONS: Considering recovery and mortality rates, we observed heterogeneity between countries. Liechtenstein was the "positive" outlier with low mortality rates and high recovery rates; conversely, Yemen represented a "negative" outlier with high mortality for all four considered periods and low recovery for 30 and 60 days.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , COVID-19/epidemiology , Pandemics , Disease Outbreaks , France

8.
Metrologia ; 60(1 A), 2023.
Article in English | Scopus | ID: covidwho-2222529

ABSTRACT

The EURAMET.EM-S44 comparison was performed with 7 participants. The objective of this comparison is to provide technical evidence supporting the CMC entries of those participants who did not participate in the EURAMET.EM-S24, while other participants would have evidence confirming their improvements in this field of measurement. In addition, ±9.5 fA measurements have been performed alongside the measurement points of EURAMET.EM-S24. A commercial Keithley 6430 electrometer was used as the travelling standard. A linear drift of the travelling standard was observed at the ±95 pA, ±9.5 pA and ±0.95 pA values. Therefore, the effect of the linear drift was eliminated for the evaluation of the measurement results. The comparison reference value has been determined based on the weighted mean of participant results that survived outlier detection. The discarded results are shown in the tables in the "Measurement Results" section. Because of the Covid-19 pandemic conditions, and some delays in customs and in the sending of participants' reports, the comparison took longer than expected. © 2023 Institute of Physics Publishing. All rights reserved.
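
A reference value formed as a weighted mean, as mentioned in this abstract, is commonly computed with inverse-variance weights. The sketch below assumes that weighting scheme, which the abstract does not spell out, and it takes as input only the results that remain after outlier rejection. The function name `weighted_mean` and the example numbers are hypothetical, not values from the comparison report.

```python
from math import sqrt


def weighted_mean(values, uncertainties):
    """Inverse-variance weighted mean and its standard uncertainty.

    Assumes each result x_i has standard uncertainty u_i and weight 1/u_i**2.
    """
    weights = [1.0 / u ** 2 for u in uncertainties]
    total = sum(weights)
    reference = sum(w * x for w, x in zip(weights, values)) / total
    return reference, sqrt(1.0 / total)


# Hypothetical participant results (already screened for outliers).
reference, u_reference = weighted_mean([0.0, 10.0], [1.0, 2.0])
```

Results with smaller stated uncertainty pull the reference value toward themselves, so in this example the reference lies much closer to the first result than to the second.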

9.
Engineering Proceedings ; 18(1), 2022.
Article in English | Scopus | ID: covidwho-2199945

ABSTRACT

It is no longer possible to imagine our everyday life without time series data. This includes, for example, market developments, COVID-19 cases, electricity prices, and other data from a wide variety of domains. An important task in the analysis of these data is the detection of anomalies. In most cases, this is accomplished by examining individual time series. In our work, we use the techniques of cluster analysis to establish a relationship between time series and groups of time series. This relationship allows us to observe the development of time series in their entirety, thereby gaining additional insights. Our approach identifies outliers with a real-world reference and enables the user to locate outliers without prior knowledge. To underline the strengths of our approach, we compare our method with another known method on two real-world datasets. We found that our solution needs significantly fewer calculations, produces more reasonable results, and can be applied to real-time data. Moreover, our method detected additional outliers, whose occurrence could be explained by real events. © 2022 by the authors.

10.
Studies in Nonlinear Dynamics and Econometrics ; 2022.
Article in English | Web of Science | ID: covidwho-2197374

ABSTRACT

This paper investigates the ability of several generalized Bayesian vector autoregressions to cope with the extreme COVID-19 observations and discusses their impact on prior calibration for inference and forecasting purposes. It shows that the preferred model interprets the pandemic episode as a rare event rather than a persistent increase in macroeconomic volatility. For forecasting, the choice among outlier-robust error structures is less important, however, when a large cross-section of information is used. Besides the error structure, this paper shows that the standard Minnesota prior calibration is an important source of changing macroeconomic transmission channels during the pandemic, altering the predictability of real and nominal variables. To alleviate this sensitivity, an outlier-robust prior calibration is proposed.

11.
BMC Bioinformatics ; 23(1): 547, 2022 Dec 19.
Article in English | MEDLINE | ID: covidwho-2196036

ABSTRACT

As of June 2022, the GISAID database contains more than 11 million SARS-CoV-2 genomes, including several thousand nucleotide sequences for the most common variants such as delta or omicron. These SARS-CoV-2 strains have been collected from patients around the world since the beginning of the pandemic. We start by assessing the similarity of all pairs of nucleotide sequences using the Jaccard index and principal component analysis. As shown previously in the literature, an unsupervised cluster analysis applied to the SARS-CoV-2 genomes results in clusters of sequences according to certain characteristics such as their strain or their clade. Importantly, we observe that nucleotide sequences of common variants are often outliers in clusters of sequences stemming from variants identified earlier on during the pandemic. Motivated by this finding, we are interested in applying outlier detection to nucleotide sequences. We demonstrate that nucleotide sequences of common variants (such as alpha, delta, or omicron) can be identified solely based on a statistical outlier criterion. We argue that outlier detection might be a useful surveillance tool to identify emerging variants in real time as the pandemic progresses.
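
The Jaccard index used in this abstract to compare pairs of sequences can be illustrated as below. This is a sketch under an assumption: each sequence is represented by its set of overlapping k-mers, which is one common choice and may differ from the representation the authors used. The function names `kmers` and `jaccard`, the value of `k`, and the toy sequences are hypothetical.

```python
def kmers(sequence, k=3):
    """Return the set of overlapping substrings of length k."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard(a, b):
    """Jaccard index |a & b| / |a | b| of two sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Toy nucleotide strings, far shorter than a real genome.
similarity = jaccard(kmers("ACGT"), kmers("ACGA"))
```

A pairwise similarity matrix built this way could then feed a dimensionality reduction or clustering step, with sequences far from every cluster treated as candidate outliers.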


Subject(s)
COVID-19 , Humans , Base Sequence , SARS-CoV-2 , Cluster Analysis , Databases, Factual

12.
Cancers (Basel) ; 14(24), 2022 Dec 18.
Article in English | MEDLINE | ID: covidwho-2163250

ABSTRACT

We investigated lung-heart toxicity and mortality in 123 women with stage I-II breast cancer enrolled in 2007-2011 in a prospective trial of adjuvant radiotherapy (TomoBreast). We were concerned whether the COVID-19 pandemic affected the outcomes. All patients were analyzed as a single cohort. Lung-heart status was reverse-scored as freedom from adverse-events (fAE) on a 1-5 scale. Left ventricular ejection fraction (LVEF) and pulmonary function tests were untransformed. Statistical analyses applied least-square regression to calendar-year aggregated data. The significance of outliers was determined using the Dixon and the Grubbs corrected tests. At 12.0 years median follow-up, 103 patients remained alive; 10-year overall survival was 87.8%. In 2007-2019, 15 patients died, of whom 11 were cancer-related deaths. In 2020, five patients died, none from cancer. fAE and lung-heart function declined gradually over a decade through 2019, but deteriorated markedly in 2020: fAE dipped significantly from 4.6-4.6 to 4.3-4.2; LVEF dipped to 58.4% versus the expected 60.3% (PDixon = 0.021, PGrubbs = 0.054); forced vital capacity dipped to 2.4 L vs. 2.6 L (PDixon = 0.043, PGrubbs = 0.181); carbon-monoxide diffusing capacity dipped to 12.6 mL/min/mmHg vs. 15.2 (PDixon = 0.008, PGrubbs = 0.006). In conclusion, excess non-cancer mortality was observed in 2020. Deaths in that year totaled one-third of the deaths in the previous decade, and revealed observable lung-heart deterioration.
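
For orientation, the test statistics behind the Dixon and Grubbs outlier tests named in this abstract can be sketched as follows. These are the textbook forms (Grubbs' G and Dixon's simplest gap-to-range ratio); the "corrected" variants and the P-value computation used by the authors are not described in the abstract and are not reproduced here. The function names and the sample values are hypothetical.

```python
from statistics import mean, stdev


def grubbs_statistic(values):
    """Grubbs' G: largest absolute deviation from the mean, in sample SDs."""
    mu = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return max(abs(v - mu) for v in values) / s


def dixon_q(values):
    """Dixon's Q: gap between the most extreme value and its neighbor,
    divided by the range, taking the larger of the two ends."""
    x = sorted(values)
    spread = x[-1] - x[0]
    return max(x[1] - x[0], x[-1] - x[-2]) / spread


# Hypothetical yearly aggregates with one unusual year.
yearly = [1.0, 2.0, 3.0, 4.0, 10.0]
g = grubbs_statistic(yearly)
q = dixon_q(yearly)
```

Each statistic would then be compared with a critical value that depends on the sample size and the chosen significance level, which is how the tests decide whether the extreme value is a significant outlier.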

13.
Journal of Forecasting ; 2022.
Article in English | Scopus | ID: covidwho-2148304

ABSTRACT

Several procedures to forecast daily risk measures in cryptocurrency markets have been recently implemented in the literature. Among them, long-memory processes, procedures taking into account the presence of extreme observations, procedures that include more than a single regime, and quantile regression-based models have performed substantially better than standard methods in terms of forecasting risk measures. Those procedures are revisited in this paper, and their value at risk and expected shortfall forecasting performance are evaluated using recent Bitcoin and Ethereum data that include periods of turbulence due to the COVID-19 pandemic, the third halving of Bitcoin, and the Lexia class action. Additionally, in order to mitigate the influence of model misspecification and enhance the forecasting performance obtained by individual models, we evaluate the use of several forecast combining strategies. Our results, based on a comprehensive backtesting exercise, reveal that, for Bitcoin, there is no single procedure outperforming all other models, but for Ethereum, there is evidence showing that the GAS model is a suitable alternative for forecasting both risk measures. We found that the combining methods were not able to outperform the better of the individual models. © 2022 John Wiley & Sons Ltd.

14.
22nd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2022 ; : 905-912, 2022.
Article in English | Scopus | ID: covidwho-1992575

ABSTRACT

Social isolation is a serious public health issue that can lead to various mental and physical health problems for individuals and jeopardize their quality of life. The issue is more critical for older adults and palliative patients who are already suffering from different diseases and lack some abilities for performing their daily tasks. Additionally, the situation worsened when the COVID-19 pandemic added forced social isolation to people's lives worldwide. In this paper, we propose a framework for detecting social isolation in community-based palliative care networks. We treat the problem as outlier detection in community-based social graphs. Hence, we map the network to an attributed weighted social graph, in which each patient is linked to a set of informal and formal care providers. We define formulae and indices to extract the norm of the society in terms of structural connections and assign a value to each individual based on the quality and quantity of their connections. The structural indices and a set of quality of life features such as age, marital status, life satisfaction, and capabilities are then used to identify the isolated individuals. We analyze and evaluate the performance of our algorithm on real-life data obtained by the Windsor Essex Compassion Care Community (WECCC), as well as various synthetic social graphs. © 2022 IEEE.

15.
IEEE Transactions on Signal Processing ; 70:2859-2868, 2022.
Article in English | Academic Search Complete | ID: covidwho-1901511

ABSTRACT

Daily pandemic surveillance, often achieved through the estimation of the reproduction number, constitutes a critical challenge for national health authorities to design counter-measures. In an earlier work, we proposed to formulate the estimation of the reproduction number as an optimization problem, combining data-model fidelity and space-time regularity constraints, solved by nonsmooth convex proximal minimizations. Though promising, that first formulation significantly lacks robustness against the low quality of Covid-19 data (irrelevant or missing counts, pseudo-seasonalities, ...) stemming from the emergency and crisis context, which significantly impairs accurate assessment of the pandemic's evolution. The present work aims to overcome these limitations by carefully crafting a functional that permits joint estimation, in a single step, of the reproduction number and of outliers defined to model low-quality data. This functional also enforces epidemiology-driven regularity properties for the reproduction number estimates, while preserving convexity, thus permitting the design of efficient minimization algorithms, based on proximity operators that are derived analytically. The explicit convergence of the proposed algorithm is proven theoretically. Its relevance is quantified on real Covid-19 data, consisting of daily new infection counts for 200+ countries and for the 96 metropolitan France counties, publicly available at Johns Hopkins University and Santé-Publique-France. The procedure permits automated daily updates of these estimates, reported via animated and interactive maps. Open-source estimation procedures will be made publicly available.

16.
9th International Conference on Frontiers in Intelligent Computing: Theory and Applications, FICTA 2021 ; 267:429-439, 2022.
Article in English | Scopus | ID: covidwho-1844314

ABSTRACT

Outliers, or outlying observations, are values in data that appear unusual. It is essential to analyze various unexpected events or anomalies in the economic domain, such as a sudden stock market crash, a mismatch between a country's per capita income and its overall development, an abrupt change in the unemployment rate and a steep fall in bank interest rates, to find insights for the benefit of humankind. These situations can arise for several reasons, of which a pandemic is a major one. The present COVID-19 pandemic also disrupted the global economy considerably, as various countries faced various types of difficulties. This motivates the present researchers to identify a few such difficult areas in the economic domain that arise from the pandemic situation and to identify the countries that are most affected in each category. Two well-known machine-learning techniques, DBSCAN (a density-based clustering approach) and Z-score (a statistical technique), are utilized in this analysis. The results can be used as suggestive measures for administrative bodies, which shows the effectiveness of the study. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

17.
International Journal of Nonlinear Analysis and Applications ; 13(1):2971-2983, 2022.
Article in English | Web of Science | ID: covidwho-1811863

ABSTRACT

Estimators of survival functions can be affected by outliers, so that the estimates move away from their real values, especially as the proportion of outliers within the sample of the random variable increases. The research included a comparison of a number of Bayesian methods for estimating survival functions of the Burr-X distribution with different percentages of outliers within the sample. Simulation results showed how the estimation methods are affected by sample size, the percentage of outliers, and the real values of the distribution parameters. Mean square error was adopted as the measure for comparing the estimation methods across a number of simulation experiments. The research also included a case study of Covid-19 as a practical application. Other estimation methods (the maximum likelihood estimation method, the moment method, and the shrinkage method) could be considered to examine how far they are affected by outlier values.

18.
Int J Inj Contr Saf Promot ; 29(3): 382-393, 2022 Sep.
Article in English | MEDLINE | ID: covidwho-1769046

ABSTRACT

Little is known about the effect of COVID-19 on road safety indicators (RSIs) in developing countries, and existing studies provide limited information regarding this impact. This prompted the author to evaluate the impact of COVID-19 on RSIs in Turkey. RSIs and related indices of Turkey between 2016 and 2020 were collected. To evaluate the impact, RSIs whose 2020 measures differed significantly from the pre-COVID era were identified using an outlier detection technique and regression analysis. K-means clustering was used to group RSIs according to their variation patterns in the study period. Results show that COVID-19 led to significant decreases in 26 RSIs, especially ones related to non-fatal road traffic injuries. COVID-19 resulted in a significant drop in road traffic crashes and related indices. Also, considerable changes in monthly and daily fatalities and injuries in 2020 were observed. Clustering results revealed that COVID-19 significantly impacted the variation patterns of the studied RSIs, especially ones related to non-fatal injuries. Clustering aided in identifying RSIs affected by COVID-19 that the other methods used were unable to detect. COVID-19 led to significant changes in road safety indices in Turkey. Road authorities and researchers should be aware of these significant fluctuations in road safety data.


Subject(s)
COVID-19 , Wounds and Injuries , Accidents, Traffic , COVID-19/epidemiology , Humans , Turkey/epidemiology

19.
22nd International Conference on Artificial Intelligence in Education, AIED 2021 ; 12749 LNAI:290-295, 2021.
Article in English | Scopus | ID: covidwho-1767419

ABSTRACT

New ways to identify students in need of assistance are imperative to the evolution of online tutoring platforms. Currently implemented models to identify struggling students use costly and tedious classroom observation paired with students' platform usage, and are often suitable for only a subset of students. With the recent influx of new students to online tutoring platforms due to COVID-19, a simple method to quickly identify struggling students could help facilitate effective remote learning. To this end, we created an anomaly detection algorithm that models the normal behavior of students during remote learning and recognizes when students deviate from this behavior. We demonstrated how anomalous behavior revealed which students needed additional assistance and predicted student learning outcomes. © 2021, Springer Nature Switzerland AG.

20.
Contributions to Economics ; : 495-513, 2022.
Article in English | Scopus | ID: covidwho-1661646

ABSTRACT

Annuity pricing is critical to insurance companies for their financial liabilities. Companies aim to adjust prices using a forecasting model that best fits their historical data, which may contain outliers that influence the model. Environmental conditions and extraordinary events such as a weak health system, an outbreak of war, and the occurrence of pandemics like Spanish flu or Covid-19 may cause outliers, resulting in misevaluation of mortality rates. These outliers should be taken into account to preserve the financial strength and liability of the life insurance industry. In this study, we aim to determine whether mortality jumps have an impact on annuity pricing. We question the annuity price fluctuations among different countries and two models on country characteristics. Moreover, we show the annuity pricing on a portfolio for a more comprehensive assessment. To achieve this, a simulated diverse portfolio is created for the prices of four types of life annuities. Canada, Japan, and the United Kingdom are considered as developed countries with high longevity risk, and Russia and Bulgaria as emerging countries. The results of this study support the use of outlier-adjusted models for specific countries. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
